Introduction

To answer the research question “What is the effect over time of music industry influence on musical artists, in terms of musical content?”, popularity, complexity, and outside influence on an artist should each be measured over time. These measures are subjective, but they are justifiable and grounded in the data. Each metric will be computed for every song of a specific case study artist for which there is valid data. Ultimately, the goal is to compute each of these metrics for all of a case study artist’s songs which charted at some point on the Billboard Hot 100 and to plot each metric against the song’s release date.

Popularity will be measured using data on a song’s life and behavior while it was on the Billboard Hot 100. Multiple metrics will be proposed and compared. The complexity of a song will be measured by combining musical complexity (if data on it exists) and lyrical complexity. Finally, outside influence will be measured primarily by the number of writers on the song who were not band members.

All the functions will be defined in a separate source file, and then called in this file. All of the data used here should have already been preprocessed in a previous file.

options(kableExtra.auto_format = FALSE)
library(ggrepel)
library(htmltools)
library(tidyverse)
library(tidytext)
library(data.table)
library(plyr)
library(quanteda)
library(kableExtra)
library(knitr)
library(gridExtra)
library(formattable)
library(psych)
library(PerformanceAnalytics)

source("frostFunctions.R")
billboardDf = read_csv("FrostData/billboardDataClean.csv", col_types = cols())
spotifyDf = read_csv("FrostData/spotifyDataClean.csv", col_types = cols())
riaaDf = read_csv("FrostData/riaaDataClean.csv", col_types = cols())
grammyDf = read_csv("FrostData/grammyDataClean.csv", col_types = cols())
songSecsDf = read_csv("FrostData/songSectionDataClean.csv", col_types = cols())
songAttrsDf = read_csv("FrostData/songAttrsDataClean.csv", col_types = cols())

Join all Data for an Artist

Here the functionality will be built to join all data on a chosen artist. For the examples to follow, the band “Maroon 5” will be used.

archArtist = artistDataJoiner("Maroon 5")
validAlbums = c("Red Pill Blues + (Deluxe)", "v (Deluxe)", "Overexposed Track by Track", "Hands all over (Deluxe)", "it Won't be Soon Before Long.", "Songs About Jane")

archArtist = filter(archArtist, Album %in% validAlbums)
#archArtist

Measure of Popularity

Popularity will be quantified in multiple ways to account for the imperfections of each metric. The resulting popularity metrics can be compared and contrasted across songs. They are as follows:

pop1 = sum(1/current * weeks)
pop2 = sum(1/current)
pop3 = ln(101.1 - min(peak))
pop4 = mean(ln(101.1 - current))

Pop1 rewards songs which reach their peak later in their time on the charts, so it discriminates against tracks which peak right away and dissipate quickly. Pop2 does not have an appropriate scale, since it treats the number 2 spot on the Hot 100 as half as valuable as the number 1 spot. Pop3 only considers the peak position on the chart, but it scales that position more appropriately than the first two metrics. Pop4 uses the natural log scale to weigh differences in chart position more appropriately, and takes the mean of all the log chart positions to account for both longevity and position.
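
As a rough illustration of the four formulas, the sketch below computes them for a single song in base R. It is not the code in frostFunctions.R: the helper name popularityMetrics is hypothetical, current is assumed to be the song's chart position in each week on the Hot 100 in chronological order, weeks is assumed to be the running count of weeks on the chart, and the peak is assumed to be the smallest value of current.

```r
# Hypothetical helper, not the getPopularityMetric() used in this report.
# `current`: chart position for each week on the Hot 100, in chronological order.
popularityMetrics <- function(current) {
  weeks <- seq_along(current)  # assumed: 1 for the first charting week, 2 for the second, ...
  c(
    pop1 = sum(1 / current * weeks),       # later weeks count for more
    pop2 = sum(1 / current),               # position 2 is worth half of position 1
    pop3 = log(101.1 - min(current)),      # peak position only, on a log scale
    pop4 = mean(log(101.1 - current))      # mean log-scaled position over all weeks
  )
}

# Two made-up chart runs containing the same positions in opposite order.
earlyPeak <- popularityMetrics(c(1, 1, 3, 5, 10))
latePeak  <- popularityMetrics(c(10, 5, 3, 1, 1))
print(round(rbind(earlyPeak, latePeak), 3))
```

Because both made-up runs contain the same positions, pop2, pop3, and pop4 agree between them, while pop1 is larger for the run that peaks late, which is the behavior described above.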

archArtistPop = getPopularityMetric(archArtist)
#archArtistPop

ggplot(archArtistPop, aes(x = ReleaseDate, y = pop1)) + geom_point() + geom_smooth(method = "lm") + labs(title = "Pop1 of Maroon 5 Songs By Release Date")

ggplot(archArtistPop, aes(x = ReleaseDate, y = pop2)) + geom_point() + geom_smooth(method = "lm") + labs(title = "Pop2 of Maroon 5 Songs By Release Date")

ggplot(archArtistPop, aes(x = ReleaseDate, y = pop3)) + geom_point() + geom_smooth(method = "lm") + labs(title = "Pop3 of Maroon 5 Songs By Release Date")

ggplot(archArtistPop, aes(x = ReleaseDate, y = pop4)) + geom_point() + geom_smooth(method = "lm") + labs(title = "Pop4 of Maroon 5 Songs By Release Date")

write_csv(archArtistPop, "maroon5_pop.csv")

By each popularity metric, there is indication of a slight increase in the popularity of Maroon 5’s charting songs over time. In pop1 and pop2, a high leverage point likely has some influence on the exact slope of the best fit line.
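
One way to check for high leverage points is to look at the hat values of the fitted line. The sketch below uses made-up numbers, not the Maroon 5 data, and the cutoff of twice the average leverage is a common rule of thumb assumed here rather than something taken from this analysis.

```r
# Made-up data: five songs, the last one released much later than the others.
releaseDay <- c(1, 2, 3, 4, 20)
popScore   <- c(0.2, 0.1, 0.4, 0.3, 3.0)

fit <- lm(popScore ~ releaseDay)
leverage <- hatvalues(fit)  # diagonal of the hat matrix, one value per song

# Assumed rule of thumb: flag points above twice the average leverage p / n,
# where p is the number of coefficients and n the number of observations.
threshold <- 2 * length(coef(fit)) / length(releaseDay)
highLeverage <- which(leverage > threshold)
print(round(leverage, 3))
print(highLeverage)
```

Refitting the line without the flagged point and comparing slopes would then show how much that point drives the trend.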

Measure Outside Influence

To gauge how much outside influence went into the creation of a song, the number of credited writers on the song who are not members of the artist or band is counted.
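
The counting step can be sketched as follows. This is an illustration only: the helper name countOutsideWriters is hypothetical, and it assumes each song's writing credits are stored as one comma-separated string, which may not match how the cleaned data is laid out. "Writer A" and "Writer B" are placeholder names.

```r
# Hypothetical helper, not the getOutsideInfluenceScore() used in this report.
# `writers`: one comma-separated credit string per song (assumed format).
countOutsideWriters <- function(writers, members) {
  vapply(strsplit(writers, ",", fixed = TRUE), function(credited) {
    credited <- trimws(credited)          # drop spaces around each name
    length(setdiff(credited, members))    # credited names that are not members
  }, integer(1))
}

members <- c("Adam Levine", "Jesse Carmichael")
credits <- c("Adam Levine, Jesse Carmichael",
             "Adam Levine, Writer A, Writer B")
print(countOutsideWriters(credits, members))
```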

maroon5Members= c("Adam Levine", "Jesse Carmichael", "Mickey Madden", "James Valentine", "Matt Flynn", "PJ Morton", "Sam Farrar", "Ryan Dusick")

archArtistInfluence = getOutsideInfluenceScore(archArtist, maroon5Members)
#archArtistInfluence

ggplot(archArtistInfluence, aes(x = ReleaseDate, y = nonBandMemberWriters)) + geom_point() + geom_smooth(method = "lm") + labs(title = "Number of non-Band Member Writers on Maroon 5's Songs Over Time")

It is apparent that all of Maroon 5’s Billboard Hot 100 charting songs were written completely and solely by members of Maroon 5 until just before 2015. After this point, all the charting songs have multiple writing credits given to writers not in Maroon 5. This indicates increased outside influence in the band’s later music.

Measure of Lyrical Complexity

Some further preprocessing will be done to tidy the lyric data. Then the total number of words and the number of unique non-stop words will be counted, and the number of unique words divided by the total number of words will be used as a measure of lyrical repetition in the song. Furthermore, the average word length in the song will be recorded, as well as the number of words divided by the number of seconds in the song to get the words per second. Repetition, the number of unique words divided by the total number of words, will be considered most important and thus weighed most heavily. The average length of words in the song will be considered second most important and weighed just below the measure of repetition, and the average number of syllables per word as well as the number of words per second will be weighed the lightest.
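
A simplified version of this scoring can be sketched in base R. The helper lyricFeatures, the three made-up lyric strings, and the weights are all assumptions for illustration; stop-word removal and the syllable count are left out to keep the example short, so this is not the computation in getLyricalComplexity().

```r
# Hypothetical helper: per-song lyric features from raw text and song length.
lyricFeatures <- function(lyrics, durationSecs) {
  words <- tolower(unlist(strsplit(lyrics, "[^A-Za-z']+")))
  words <- words[nchar(words) > 0]
  c(
    uniqueRatio   = length(unique(words)) / length(words),  # repetition measure
    avgWordLength = mean(nchar(words)),
    wordsPerSec   = length(words) / durationSecs
  )
}

# Three made-up "songs" (text and durations are invented).
songs <- rbind(
  repetitive = lyricFeatures("la la la love la la la love", 4),
  middling   = lyricFeatures("I will follow you home tonight", 3),
  wordy      = lyricFeatures("Wandering strangers whisper forgotten melodies", 2)
)

# Assumed weights: repetition heaviest, word length next, words per second lightest.
weights <- c(uniqueRatio = 0.5, avgWordLength = 0.3, wordsPerSec = 0.2)

# Standardize each feature across songs, then take the weighted sum.
lyricalComplexity <- as.vector(scale(songs) %*% weights)
names(lyricalComplexity) <- rownames(songs)
print(round(lyricalComplexity, 2))
```

Standardizing before weighting keeps a feature with a large raw range, such as word length, from dominating the score purely because of its units.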

lyricalComplexDf = getLyricalComplexity(archArtist, TRUE)
## Joining, by = "word"
#lyricalComplexDf


ggplot(lyricalComplexDf, aes(x = ReleaseDate, y = lyricalComplexity)) + geom_point() + geom_smooth(method = "lm") + labs(title = "Lyrical Complexity of Maroon 5 Songs by Release Date")

When the lyrical complexity of Maroon 5’s Billboard charting songs is plotted against time, it is apparent that there is a negative association between lyrical complexity and release date. By far the least lyrically complex song was one of the more recent, released around 2017. While this is worth noting, it should also be said that this point and a point around 2002 with a very high score are both arguably high leverage points.

Measure of Musical Complexity

Previously, the music data was held for each section of each song, but it needs to be aggregated to the song level. Now, for each song, the number of unique chords, non-diatonic chords, extended chords, sections, and section ends that are different will be held. It should be noted that not all songs have chord data, so these songs will receive 0 for musical complexity after the available complexity levels are standardized. This is so that the missing chord data does not affect the total complexity of a song, which will be calculated later using the standardized lyrical and musical complexities.
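
The treatment of missing chord data can be sketched as below. The helper name standardizeWithMissing is hypothetical; the idea is only that a song with no data is placed at the mean (0) of the standardized scale, so it neither raises nor lowers a later sum of standardized scores.

```r
# Hypothetical helper: standardize the available values, then set missing ones to 0.
standardizeWithMissing <- function(x) {
  z <- as.vector(scale(x))  # scale() skips NA when computing the mean and sd
  z[is.na(z)] <- 0          # missing songs sit at the average
  z
}

# Made-up raw musical complexity values; the third song has no chord data.
print(standardizeWithMissing(c(2, 4, NA, 6)))
```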

The musical complexity score is computed by weighing the number of non-diatonic chords (chords outside of the song’s key, which are not expected to be heard) and the number of unique chords in the song more heavily than the number of extended chords and the number of sections which are different, as the latter two are arguably less definitive measures of musical complexity.
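
To make the chord counts concrete, the sketch below derives them for one made-up progression in C major. The helper chordFeatures, the simple suffix rule for spotting extended chords, and the weights are assumptions for illustration; the section-based measures are omitted, and the real getMusicComplexity() may define these counts differently.

```r
# Hypothetical helper: chord counts for one song given the chords of its key.
chordFeatures <- function(chords, diatonic) {
  uniq <- unique(chords)
  isExtended <- grepl("(7|9|11|13)$", uniq)          # assumed: extensions end in 7, 9, 11, or 13
  triads <- sub("(maj)?(7|9|11|13)$", "", uniq)      # strip the extension before the key check
  c(
    uniqueChords = length(uniq),
    nonDiatonic  = sum(!(triads %in% diatonic)),
    extended     = sum(isExtended)
  )
}

cMajor <- c("C", "Dm", "Em", "F", "G", "Am", "Bdim")
features <- chordFeatures(c("C", "G", "Am", "F", "Bb", "C", "G7"), cMajor)
print(features)

# Assumed weights: unique and non-diatonic chords count double.
weights <- c(uniqueChords = 2, nonDiatonic = 2, extended = 1)
rawScore <- sum(features * weights)
print(rawScore)
```

Here Bb is the only chord outside the key and G7 the only extended chord, so the raw score is driven mostly by the six unique chords.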

musicComplexDf = getMusicComplexity(archArtist, TRUE)
#musicComplexDf

ggplot(musicComplexDf, aes(x = ReleaseDate, y = musicalComplexity)) + geom_point() + geom_smooth(method = "lm") + labs(title = "Musical Complexity of Maroon 5 Songs By Release Date")

There is less chord data to go on, but what is there shows Maroon 5 scoring lower on musical complexity in their later songs than in their earlier material.

Now all of the smaller metric datasets will be joined and all of the columns other than the count of writers who are not in the band will be standardized.
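
The join-then-standardize step can be sketched in base R as follows. The two small data frames are made up, and the approach (merge on Name, then scale every numeric column except nonBandMemberWriters) is an assumption about what fullMetricsDataSet() does rather than its actual code.

```r
# Made-up per-song metric tables sharing a Name key.
popDf  <- data.frame(Name = c("A", "B", "C"), pop1 = c(1, 4, 7))
inflDf <- data.frame(Name = c("A", "B", "C"), nonBandMemberWriters = c(0, 2, 6))

metrics <- merge(popDf, inflDf, by = "Name")

# Standardize every numeric column except the raw writer count.
numericCols <- names(metrics)[sapply(metrics, is.numeric)]
toScale <- setdiff(numericCols, "nonBandMemberWriters")
metrics[toScale] <- lapply(metrics[toScale], function(x) as.vector(scale(x)))
print(metrics)
```

Leaving the writer count unscaled keeps it interpretable as an actual number of people.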

artistMetricDf = fullMetricsDataSet(archArtistPop, archArtistInfluence, lyricalComplexDf, musicComplexDf, TRUE)
#artistMetricDf

#artistMetricDf %>%
#  select(Name, pop1, pop2, pop3, pop4) %>%
#  gather(key = "Metric", value = "Score", -Name) %>%
#  ggplot(aes(x = Name, y = Score, fill = Metric)) + geom_col(position = "dodge") + labs(title = "Comparison of Popularity Metrics Across Maroon 5 Billboard Hot 100 Songs") + theme(axis.text.x = element_text(angle = 90))

Applicable Functionality

Now that all of the metric data is collected along with the original data on the artist’s songs, their tracks can be compared directly to each other in some meaningful ways. First, a function will be made to compare chosen tracks of a particular artist. The result is a few plots tracking each song’s life on the Billboard Hot 100, and a plot giving the track’s contribution to the pop1 metric each week. Then there are some formatted and color-coded tables to summarize the metric data. Standardized values below the average (below 0) are shaded red, and values above 0 are shaded green. This is run on a selection of Maroon 5 songs below.

#Can track pop1 metric over time because weeks is a changing metric
#Join all of the originality and complexity metrics because they are attached to the song, not moving by week


tables = compareTracks(c("She will be loved", "Girls like you"), archArtist, artistMetricDf)

#tables = compareTracks(c("She will be loved", "Harder to Breathe", "Wait", "Sugar"), archArtist, artistMetricDf)

kable(tables[1]) %>%
  kable_styling(bootstrap_options = c("striped", "hover"))
Name | ReleaseDate | Label | nonBandMemberWriters
Girls Like you | 2018-05-30 | 222 Records / Interscope | 6
She will be Loved | 2002-06-25 | A&m/Octone | 0
as.data.frame(tables[2]) %>%
  mutate(pop1 = cell_spec(pop1, "html",color = ifelse(pop1 > 0,"green", "red")),
         pop2 = cell_spec(pop2, "html",color = ifelse(pop2 > 0,"green", "red")),
         pop3 = cell_spec(pop3, "html",color = ifelse(pop3 > 0,"green", "red")),
         pop4 = cell_spec(pop4, "html",color = ifelse(pop4 > 0,"green", "red"))) %>%
  kable("html", escape = FALSE) %>%
  kable_styling()
Name | ReleaseDate | pop1 | pop2 | pop3 | pop4 | GrammyAward | RiaaStatus
Girls Like you | 2018-05-30 | 3.18889267498101 | 3.0525303855611 | 0.783685927715491 | 1.02741452937264 | NA | 1x Platinum
She will be Loved | 2002-06-25 | 0.192183699021031 | 0.148605952040361 | 0.703917878435716 | 0.892206551534005 | NA | 4x Multi-Platinum
as.data.frame(tables[3]) %>%
  mutate(totalComplexity = cell_spec(totalComplexity, "html",color = ifelse(totalComplexity > 0,"green", "red"))) %>%
  kable("html", escape = FALSE) %>%
  kable_styling()
Name | ReleaseDate | totalComplexity
She will be Loved | 2002-06-25 | 0.465584251357378
Girls Like you | 2018-05-30 | -0.838057270738207
Girls Like you | 2018-05-30 | -0.941449625856669

In the comparison of an older and very popular charting Maroon 5 song, “She will be Loved”, to a newer song of theirs that is indicated as their most commercially popular, “Girls Like you”, it is apparent that small differences in how the tracks performed on the charts produced large differences in their contributions to the pop1 metric. “Girls Like you” was on the chart for about 10 weeks longer, which was certainly an indication of greater popularity and was rewarded with continual additions to the pop1 score, but it also maintained its peak spot at number 1 for much longer. “She will be Loved” climbed to its peak and fell out of the top 10 in the same number of weeks that “Girls Like you” held the top spot. This maintenance was heavily rewarded with greater and greater pop1 score additions each week, and is really the reason why “Girls Like you” received such a high value in that metric.
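
The effect of holding a top position can be seen by breaking pop1 into its weekly additions. The two chart runs below are invented for illustration and are not the actual runs of either song; the helper weeklyPop1 is hypothetical and assumes the weekly addition is the week number divided by the chart position.

```r
# Hypothetical helper: the amount each charting week adds to pop1.
weeklyPop1 <- function(current) seq_along(current) / current

# Invented chart runs: one peaks at 3 and falls away, one holds number 1 for four weeks.
quickPeak <- c(20, 8, 3, 6, 12, 25)
longRun   <- c(20, 8, 3, 1, 1, 1, 1, 4)

print(round(cumsum(weeklyPop1(quickPeak)), 2))
print(round(cumsum(weeklyPop1(longRun)), 2))
```

In the second run, each week at number 1 adds its full week number to the total, so the running sum grows faster the longer the song stays on top.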

In addition, the tables indicate that “Girls Like you” had 6 outside writers whereas “She will be Loved” had none. Both songs had above-average popularity for Maroon 5 songs, but “Girls Like you” had greater scores across the metrics. What is noteworthy is that “She will be Loved” had above-average complexity while “Girls Like you” was about 1 standard deviation below the average in complexity, with only a minor difference between its two versions, the more complex of which includes a rap verse.

All of the previously created functionality should be applicable to any artist for which data is available. The full pipeline of function calls is below. The input requires only the artist name, the names of the credited writers who represent the artist or group, and the valid albums which should be considered.

#Need to pass the artist, the member names, and the valid albums

maroon5Metrics = completeArchDf("Maroon 5", c("Adam Levine", "Jesse Carmichael", "Mickey Madden", "James Valentine", "Matt Flynn", "PJ Morton", "Sam Farrar", "Ryan Dusick"), c("Red Pill Blues + (Deluxe)", "v (Deluxe)", "Overexposed Track by Track", "Hands all over (Deluxe)", "it Won't be Soon Before Long.", "Songs About Jane"), c(), TRUE) #May be 2 versions of girls like you - one with rap and one without
## Joining, by = "word"
singleArtistVisual("Maroon 5",maroon5Metrics)
## Warning: Removed 1 rows containing non-finite values (stat_smooth).
## Warning: Removed 1 rows containing missing values (geom_point).

## 
## Call:
## lm(formula = fullMetric$totalComplexity ~ fullMetric$ReleaseDate)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.8623 -0.8262 -0.1167  0.4326  4.0794 
## 
## Coefficients:
##                          Estimate Std. Error t value Pr(>|t|)   
## (Intercept)             5.2439757  1.8417639   2.847  0.00776 **
## fullMetric$ReleaseDate -0.0003467  0.0001210  -2.865  0.00742 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.18 on 31 degrees of freedom
## Multiple R-squared:  0.2094, Adjusted R-squared:  0.1839 
## F-statistic: 8.209 on 1 and 31 DF,  p-value: 0.007421

From the simple linear regressions of each of the three metrics against the release date of all of Maroon 5’s charting songs, it seems there was a moderate decrease in their song complexity, a slight increase in their song popularity, and a drastic increase in outside influence on their music as time passed.
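
Since ReleaseDate is a Date, R treats it as a number of days in the regression, so the reported slope is a change in total complexity per day. A quick conversion to a per-year figure, using the Maroon 5 estimate from the output above, makes the size of the trend easier to read. The 365.25 days-per-year factor is an approximation chosen for this sketch.

```r
# Dates are stored as days since 1970-01-01, so one day is one unit.
oneDay <- as.numeric(as.Date("1970-01-02"))
print(oneDay)

# Slope for Maroon 5 as reported in the regression output (per day).
slopePerDay  <- -0.0003467
slopePerYear <- slopePerDay * 365.25   # approximate days per year
print(round(slopePerYear, 3))
```

On this reading, the fitted line drops by roughly an eighth of a complexity unit per year of release date.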

justinTimberlakeMetrics = completeArchDf("Justin Timberlake", c("Justin Timberlake"), c("Justified", "Man of the Woods", "The 20/20 Experience - 2 of 2 (Deluxe)", "The 20/20 Experience (Deluxe Version)", "Futuresex/Lovesounds Deluxe Edition"), c(), TRUE)
## Joining, by = "word"
singleArtistVisual("Justin Timberlake",justinTimberlakeMetrics)

## 
## Call:
## lm(formula = fullMetric$totalComplexity ~ fullMetric$ReleaseDate)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.4169 -0.6048  0.3002  0.7639  1.3309 
## 
## Coefficients:
##                          Estimate Std. Error t value Pr(>|t|)
## (Intercept)             0.3010869  1.7359658   0.173    0.864
## fullMetric$ReleaseDate -0.0000184  0.0001143  -0.161    0.874
## 
## Residual standard error: 1.005 on 16 degrees of freedom
## Multiple R-squared:  0.001617,   Adjusted R-squared:  -0.06078 
## F-statistic: 0.02592 on 1 and 16 DF,  p-value: 0.8741

From the simple linear regressions of each of the three metrics against the release date of all of Justin Timberlake’s charting songs, it seems there was an insignificant change in his song complexity, a very slight decrease in his song popularity, and a slight increase in outside influence on his music as time passed.

twentyOnePilotsMetrics = completeArchDf("Twenty One Pilots", c("Tyler Joseph", "Josh Dun", "Nick Thomas", "Chris Salih"), c("Trench", "Blurryface","Vessel (with Bonus Tracks)", "Twenty One Pilots"), c("Cancer"), TRUE)#Cancer was a cover so it is excluded even though it made the chart
## Joining, by = "word"
singleArtistVisual("Twenty One Pilots", twentyOnePilotsMetrics)

## 
## Call:
## lm(formula = fullMetric$totalComplexity ~ fullMetric$ReleaseDate)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.3339 -0.6712 -0.1554  0.5499  1.6170 
## 
## Coefficients:
##                          Estimate Std. Error t value Pr(>|t|)
## (Intercept)             5.2306247 11.5187697   0.454    0.666
## fullMetric$ReleaseDate -0.0003069  0.0006755  -0.454    0.666
## 
## Residual standard error: 1.062 on 6 degrees of freedom
## Multiple R-squared:  0.03326,    Adjusted R-squared:  -0.1279 
## F-statistic: 0.2064 on 1 and 6 DF,  p-value: 0.6656

From the simple linear regressions of each of the three metrics against the release date of all of Twenty One Pilots’ charting songs, it seems there was a very slight decrease in their song complexity, a slight decrease in their song popularity, and no change in outside influence on their music as time passed.

fooFightersMetrics = completeArchDf("Foo Fighters", c("Dave Grohl", "Nate Mendel", "Pat Smear", "Taylor Hawkins", "Chris Shiflett", "Rami Jaffee", "William Goldsmith", "Franz Stahl"), c("Wasting Light", "Echoes, Silence, Patience & Grace", "In your Honor", "One by One (Expanded Edition)", "There is Nothing Left to Lose", "The Colour and the Shape", "Concrete and Gold", "Foo Fighters", "Sonic Highways"), c(), TRUE)
## Joining, by = "word"
singleArtistVisual("Foo Fighters", fooFightersMetrics)

## 
## Call:
## lm(formula = fullMetric$totalComplexity ~ fullMetric$ReleaseDate)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -0.8654 -0.5308  0.1518  0.3827  0.7732 
## 
## Coefficients:
##                          Estimate Std. Error t value Pr(>|t|)   
## (Intercept)            -7.6339632  2.0759554  -3.677  0.00624 **
## fullMetric$ReleaseDate  0.0005724  0.0001549   3.695  0.00608 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.6447 on 8 degrees of freedom
## Multiple R-squared:  0.6306, Adjusted R-squared:  0.5844 
## F-statistic: 13.65 on 1 and 8 DF,  p-value: 0.006085

From the simple linear regressions of each of the three metrics against the release date of all of the Foo Fighters’ charting songs, it seems there was a significant increase in their song complexity, a significant decrease in their song popularity, and no change in outside influence on their music as time passed.

taylorSwiftMetrics = completeArchDf("Taylor Swift", c("Taylor Swift"), c("Reputation", "1989 (Deluxe Edition)", "Red (Deluxe Edition)", "Speak Now (Deluxe Edition)", "Fearless (Platinum Edition)", "Taylor Swift"), c(), TRUE )
## Joining, by = "word"
singleArtistVisual("Taylor Swift",taylorSwiftMetrics)

## 
## Call:
## lm(formula = fullMetric$totalComplexity ~ fullMetric$ReleaseDate)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.32948 -0.77178  0.09415  0.71349  2.41927 
## 
## Coefficients:
##                          Estimate Std. Error t value Pr(>|t|)
## (Intercept)             2.6341690  1.8710342   1.408    0.164
## fullMetric$ReleaseDate -0.0001717  0.0001216  -1.412    0.163
## 
## Residual standard error: 1.173 on 69 degrees of freedom
## Multiple R-squared:  0.02807,    Adjusted R-squared:  0.01399 
## F-statistic: 1.993 on 1 and 69 DF,  p-value: 0.1625

From the simple linear regressions of each of the three metrics against the release date of all of Taylor Swift’s charting songs, it seems there was a slight decrease in her song complexity, a very slight increase in her song popularity, and a significant increase in outside influence on her music as time passed.

justinBieberMetrics = completeArchDf("Justin Bieber", c("Justin Bieber"), c("Purpose (Deluxe)", "Journals", "Believe (Deluxe Edition)", "under the Mistletoe (Deluxe Edition)", "My World 2.0", "Never Say Never - The Remixes", "My World"), c(), TRUE)
## Joining, by = "word"
singleArtistVisual("Justin Bieber",justinBieberMetrics)

## 
## Call:
## lm(formula = fullMetric$totalComplexity ~ fullMetric$ReleaseDate)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.21817 -0.51190  0.03743  0.53507  2.66832 
## 
## Coefficients:
##                          Estimate Std. Error t value Pr(>|t|)
## (Intercept)            -1.249e+00  2.484e+00  -0.503    0.617
## fullMetric$ReleaseDate  7.961e-05  1.580e-04   0.504    0.617
## 
## Residual standard error: 1.026 on 51 degrees of freedom
## Multiple R-squared:  0.004953,   Adjusted R-squared:  -0.01456 
## F-statistic: 0.2538 on 1 and 51 DF,  p-value: 0.6166

From the simple linear regressions of each of the three metrics against the release date of all of Justin Bieber’s charting songs, it seems there was a very slight increase in his song complexity, a slight increase in his song popularity, and an insignificant change in outside influence to his music as time passed.

britneySpearsMetrics= completeArchDf("Britney Spears", c("Britney Spears"), c("Britney Jean (Deluxe Version)", "Femme Fatale (Deluxe Version)", "Circus (Deluxe Version)", "Blackout", "In the Zone", "Britney (Digital Deluxe Version)", "Oops!... i Did it Again", "...baby One more Time (Digital Deluxe Version)", "Glory (Deluxe Version)") ,c(), TRUE)
## Joining, by = "word"
singleArtistVisual("Britney Spears",britneySpearsMetrics)

## 
## Call:
## lm(formula = fullMetric$totalComplexity ~ fullMetric$ReleaseDate)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.37977 -0.46568  0.09747  0.77227  1.30747 
## 
## Coefficients:
##                          Estimate Std. Error t value Pr(>|t|)
## (Intercept)            -2.3201364  1.6363225  -1.418    0.168
## fullMetric$ReleaseDate  0.0001655  0.0001160   1.427    0.165
## 
## Residual standard error: 0.982 on 27 degrees of freedom
## Multiple R-squared:  0.07011,    Adjusted R-squared:  0.03567 
## F-statistic: 2.036 on 1 and 27 DF,  p-value: 0.1651

From the simple linear regressions of each of the three metrics against the release date of all of Britney Spears’s charting songs, it seems there was a moderate increase in her song complexity, a very slight increase in her song popularity, and a moderate increase in outside influence on her music as time passed.

jColeMetrics = completeArchDf("j Cole", c("j Cole"), c("Revenge of the Dreamers Iii", "Kod", "2014 Forest Hills Drive", "Cole World: The Sideline Story", "4 your Eyez Only", "Born Sinner", "The Blow Up"), c(), TRUE)
## Joining, by = "word"
singleArtistVisual("j Cole", jColeMetrics)

## 
## Call:
## lm(formula = fullMetric$totalComplexity ~ fullMetric$ReleaseDate)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.81566 -0.63115  0.07773  0.66597  1.92082 
## 
## Coefficients:
##                          Estimate Std. Error t value Pr(>|t|)  
## (Intercept)            -7.0553501  3.1151674  -2.265   0.0293 *
## fullMetric$ReleaseDate  0.0004128  0.0001821   2.267   0.0291 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9508 on 38 degrees of freedom
## Multiple R-squared:  0.1192, Adjusted R-squared:  0.096 
## F-statistic: 5.141 on 1 and 38 DF,  p-value: 0.02913

From the simple linear regressions of each of the three metrics against the release date of all of J Cole’s charting songs, it seems there was a moderate increase in his song complexity, a moderate decrease in his song popularity, and a decrease in outside influence on his music as time passed.

Statistical Arguments

Testing Relationships Between Variables

maroon5Metrics$label2 = maroon5Metrics$ReleaseDate > as.Date("2014-01-01")
maroon5Metrics$label2
##  [1]  TRUE  TRUE FALSE FALSE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE
## [12] FALSE FALSE  TRUE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE
## [23] FALSE FALSE  TRUE FALSE FALSE  TRUE  TRUE FALSE  TRUE FALSE FALSE

Popularity vs Complexity

summary(lm(pop1~totalComplexity,data = maroon5Metrics))
## 
## Call:
## lm(formula = pop1 ~ totalComplexity, data = maroon5Metrics)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -0.5810 -0.5374 -0.4279  0.0819  3.4326 
## 
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)
## (Intercept)     -6.211e-17  1.768e-01   0.000    1.000
## totalComplexity -1.806e-02  1.375e-01  -0.131    0.896
## 
## Residual standard error: 1.016 on 31 degrees of freedom
## Multiple R-squared:  0.0005561,  Adjusted R-squared:  -0.03168 
## F-statistic: 0.01725 on 1 and 31 DF,  p-value: 0.8964
summary(lm(pop1~ totalComplexity*label2,data= maroon5Metrics))
## 
## Call:
## lm(formula = pop1 ~ totalComplexity * label2, data = maroon5Metrics)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.2140 -0.3502 -0.3022  0.1995  2.9513 
## 
## Coefficients:
##                             Estimate Std. Error t value Pr(>|t|)  
## (Intercept)                -0.211195   0.213769  -0.988   0.3313  
## totalComplexity            -0.001883   0.147585  -0.013   0.9899  
## label2TRUE                  0.696501   0.388026   1.795   0.0831 .
## totalComplexity:label2TRUE  0.192628   0.398698   0.483   0.6326  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9963 on 29 degrees of freedom
## Multiple R-squared:  0.1005, Adjusted R-squared:  0.007442 
## F-statistic:  1.08 on 3 and 29 DF,  p-value: 0.373
#lm(pop1~totalComplexity*(totalComplexitybreaks[1]) + totalComplexity*(totalComplexity>=breaks[2]), data = artistMetricDf)

Number of Outside Writers vs Complexity

summary(lm(totalComplexity~nonBandMemberWriters,data = maroon5Metrics))
## 
## Call:
## lm(formula = totalComplexity ~ nonBandMemberWriters, data = maroon5Metrics)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.7420 -0.9788 -0.1901  0.5998  4.9100 
## 
## Coefficients:
##                      Estimate Std. Error t value Pr(>|t|)
## (Intercept)           0.30034    0.31373   0.957    0.346
## nonBandMemberWriters -0.14265    0.09875  -1.445    0.159
## 
## Residual standard error: 1.303 on 30 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.06503,    Adjusted R-squared:  0.03387 
## F-statistic: 2.087 on 1 and 30 DF,  p-value: 0.1589
summary(lm(totalComplexity~ nonBandMemberWriters*label2,data= maroon5Metrics))
## 
## Call:
## lm(formula = totalComplexity ~ nonBandMemberWriters * label2, 
##     data = maroon5Metrics)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.8587 -0.9330 -0.1669  0.6297  4.9586 
## 
## Coefficients:
##                                 Estimate Std. Error t value Pr(>|t|)
## (Intercept)                      0.25179    0.34268   0.735    0.469
## nonBandMemberWriters            -0.08758    0.16934  -0.517    0.609
## label2TRUE                       0.36728    1.09195   0.336    0.739
## nonBandMemberWriters:label2TRUE -0.13376    0.28030  -0.477    0.637
## 
## Residual standard error: 1.344 on 28 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.07274,    Adjusted R-squared:  -0.02661 
## F-statistic: 0.7322 on 3 and 28 DF,  p-value: 0.5415

Number of Outside Writers vs Popularity

summary(lm(pop1~nonBandMemberWriters,data = maroon5Metrics))
## 
## Call:
## lm(formula = pop1 ~ nonBandMemberWriters, data = maroon5Metrics)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.18809 -0.48793 -0.17433  0.05925  2.93248 
## 
## Coefficients:
##                      Estimate Std. Error t value Pr(>|t|)  
## (Intercept)          -0.34684    0.22707  -1.527   0.1371  
## nonBandMemberWriters  0.16899    0.07147   2.364   0.0247 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9434 on 30 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.1571, Adjusted R-squared:  0.129 
## F-statistic:  5.59 on 1 and 30 DF,  p-value: 0.02473
summary(lm(pop1~nonBandMemberWriters*label2,data = maroon5Metrics))
## 
## Call:
## lm(formula = pop1 ~ nonBandMemberWriters * label2, data = maroon5Metrics)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.26000 -0.40040 -0.18283  0.09901  2.87826 
## 
## Coefficients:
##                                  Estimate Std. Error t value Pr(>|t|)
## (Intercept)                     -0.338338   0.248568  -1.361    0.184
## nonBandMemberWriters             0.136980   0.122830   1.115    0.274
## label2TRUE                       0.001501   0.792059   0.002    0.999
## nonBandMemberWriters:label2TRUE  0.040855   0.203316   0.201    0.842
## 
## Residual standard error: 0.9746 on 28 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.1604, Adjusted R-squared:  0.07046 
## F-statistic: 1.783 on 3 and 28 DF,  p-value: 0.1732

Comparison Across Metrics

#This is for correlations between popularity metrics
#Should hope to have metrics which are strongly correlated with one another
corr.test(maroon5Metrics[c("pop1", "pop2", "pop3", "pop4")])
## Call:corr.test(x = maroon5Metrics[c("pop1", "pop2", "pop3", "pop4")])
## Correlation matrix 
##      pop1 pop2 pop3 pop4
## pop1 1.00 0.95 0.40 0.49
## pop2 0.95 1.00 0.47 0.55
## pop3 0.40 0.47 1.00 0.91
## pop4 0.49 0.55 0.91 1.00
## Sample Size 
## [1] 33
## Probability values (Entries above the diagonal are adjusted for multiple tests.) 
##      pop1 pop2 pop3 pop4
## pop1 0.00 0.00 0.02 0.01
## pop2 0.00 0.00 0.01 0.00
## pop3 0.02 0.01 0.00 0.00
## pop4 0.00 0.00 0.00 0.00
## 
##  To see confidence intervals of the correlations, print with the short=FALSE option
chart.Correlation(maroon5Metrics[c("pop1", "pop2", "pop3", "pop4")])

#Need to change this to be 0
#The correlations between lyrical and musical complexity 
#Not really looking at strong correlation as a success, there can be music that is musically simple and lyrically very involved and complex
corr.test(maroon5Metrics[c("lyricalComplexity", "musicalComplexity")])
## Call:corr.test(x = maroon5Metrics[c("lyricalComplexity", "musicalComplexity")])
## Correlation matrix 
##                   lyricalComplexity musicalComplexity
## lyricalComplexity               1.0               0.4
## musicalComplexity               0.4               1.0
## Sample Size 
## [1] 33
## Probability values (Entries above the diagonal are adjusted for multiple tests.) 
##                   lyricalComplexity musicalComplexity
## lyricalComplexity              0.00              0.02
## musicalComplexity              0.02              0.00
## 
##  To see confidence intervals of the correlations, print with the short=FALSE option
chart.Correlation(maroon5Metrics[c("lyricalComplexity", "musicalComplexity")])

Join Data for all Artists

maroon5Metrics$Artist = "Maroon 5"
maroon5Metrics = maroon5Metrics[,(names(maroon5Metrics) !="label2")]
justinTimberlakeMetrics$Artist = "Justin Timberlake"
justinBieberMetrics$Artist = "Justin Bieber"
twentyOnePilotsMetrics$Artist = "Twenty One Pilots"
britneySpearsMetrics$Artist = "Britney Spears"
jColeMetrics$Artist = "J Cole"
taylorSwiftMetrics$Artist = "Taylor Swift"
fooFightersMetrics$Artist = "Foo Fighters"

allArtistMetrics = rbind(maroon5Metrics, justinTimberlakeMetrics, justinBieberMetrics, twentyOnePilotsMetrics, britneySpearsMetrics, jColeMetrics, taylorSwiftMetrics, fooFightersMetrics)
allArtistMetrics

Now do the same statistical investigation as done for just Maroon 5.

summary(lm(pop1~totalComplexity,data = allArtistMetrics))
## 
## Call:
## lm(formula = pop1 ~ totalComplexity, data = allArtistMetrics)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -0.8858 -0.4904 -0.3379 -0.0151  4.6874 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)  
## (Intercept)      0.003915   0.060688   0.065   0.9486  
## totalComplexity -0.100315   0.056221  -1.784   0.0755 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9823 on 260 degrees of freedom
## Multiple R-squared:  0.0121, Adjusted R-squared:  0.008297 
## F-statistic: 3.184 on 1 and 260 DF,  p-value: 0.07554
summary(lm(totalComplexity~nonBandMemberWriters,data = allArtistMetrics))
## 
## Call:
## lm(formula = totalComplexity ~ nonBandMemberWriters, data = allArtistMetrics)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.7840 -0.7046  0.1309  0.6568  5.1107 
## 
## Coefficients:
##                      Estimate Std. Error t value Pr(>|t|)
## (Intercept)           0.09968    0.09440   1.056    0.292
## nonBandMemberWriters -0.05002    0.03369  -1.485    0.139
## 
## Residual standard error: 1.081 on 259 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.008443,   Adjusted R-squared:  0.004615 
## F-statistic: 2.205 on 1 and 259 DF,  p-value: 0.1387
summary(lm(pop1~nonBandMemberWriters,data = allArtistMetrics))
## 
## Call:
## lm(formula = pop1 ~ nonBandMemberWriters, data = allArtistMetrics)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -0.9133 -0.4408 -0.3550 -0.0286  4.8756 
## 
## Coefficients:
##                      Estimate Std. Error t value Pr(>|t|)
## (Intercept)          -0.08890    0.08602  -1.033    0.302
## nonBandMemberWriters  0.04796    0.03069   1.562    0.119
## 
## Residual standard error: 0.985 on 259 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.009338,   Adjusted R-squared:  0.005513 
## F-statistic: 2.441 on 1 and 259 DF,  p-value: 0.1194
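
The three regressions above may be easier to compare side by side than as separate summaries. The sketch below collects the slope, its p-value, and R-squared from several simple regressions into one table. It runs on simulated data rather than allArtistMetrics, and the helper name fitSummary is hypothetical.

```r
# Hypothetical helper: one-row summary of a simple linear regression.
fitSummary = function(formula, data){
  fit = summary(lm(formula, data = data))
  data.frame(
    model = deparse(formula),
    slope = fit$coefficients[2, "Estimate"],
    pValue = fit$coefficients[2, "Pr(>|t|)"],
    rSquared = fit$r.squared
  )
}

# Simulated stand-in data (not the real metrics).
set.seed(1)
toy = data.frame(totalComplexity = rnorm(50), nonBandMemberWriters = rpois(50, 2))
toy$pop1 = -0.1 * toy$totalComplexity + rnorm(50)

fitTable = rbind(
  fitSummary(pop1 ~ totalComplexity, toy),
  fitSummary(totalComplexity ~ nonBandMemberWriters, toy),
  fitSummary(pop1 ~ nonBandMemberWriters, toy)
)
print(fitTable)
```

Passing allArtistMetrics in place of the simulated frame would be expected to reproduce the estimates printed above.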
corr.test(allArtistMetrics[c("pop1", "pop2", "pop3", "pop4")])
## Call:corr.test(x = allArtistMetrics[c("pop1", "pop2", "pop3", "pop4")])
## Correlation matrix 
##      pop1 pop2 pop3 pop4
## pop1 1.00 0.95 0.38 0.47
## pop2 0.95 1.00 0.42 0.48
## pop3 0.38 0.42 1.00 0.80
## pop4 0.47 0.48 0.80 1.00
## Sample Size 
## [1] 262
## Probability values (Entries above the diagonal are adjusted for multiple tests.) 
##      pop1 pop2 pop3 pop4
## pop1    0    0    0    0
## pop2    0    0    0    0
## pop3    0    0    0    0
## pop4    0    0    0    0
## 
##  To see confidence intervals of the correlations, print with the short=FALSE option
chart.Correlation(allArtistMetrics[c("pop1", "pop2", "pop3", "pop4")])

allArtistMetricsSub = allArtistMetrics[allArtistMetrics$musicalComplexity != 0 ,]

corr.test(allArtistMetricsSub[c("lyricalComplexity", "musicalComplexity")])
## Call:corr.test(x = allArtistMetricsSub[c("lyricalComplexity", "musicalComplexity")])
## Correlation matrix 
##                   lyricalComplexity musicalComplexity
## lyricalComplexity              1.00              0.33
## musicalComplexity              0.33              1.00
## Sample Size 
## [1] 35
## Probability values (Entries above the diagonal are adjusted for multiple tests.) 
##                   lyricalComplexity musicalComplexity
## lyricalComplexity              0.00              0.05
## musicalComplexity              0.05              0.00
## 
##  To see confidence intervals of the correlations, print with the short=FALSE option
chart.Correlation(allArtistMetricsSub[c("lyricalComplexity", "musicalComplexity")])
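
The subset above relies on a logical comparison, and in R a comparison against a missing value yields NA rather than FALSE. Whether musicalComplexity is ever missing in allArtistMetrics is not shown here, so the following is only a sketch, on made-up values, of how the two indexing styles differ if it is.

```r
toy = data.frame(lyricalComplexity = c(1.2, 0.8, 1.5, 0.9),
                 musicalComplexity = c(0, 2, NA, 3))

# Logical indexing keeps an all-NA row wherever the condition is NA.
naiveSub = toy[toy$musicalComplexity != 0, ]
print(nrow(naiveSub))

# which() returns only the positions that are TRUE, so zero rows and NA rows are both dropped.
safeSub = toy[which(toy$musicalComplexity != 0), ]
print(nrow(safeSub))
```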

Artist Comparisons

The artists are now compared with one another using unstandardized values.
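
The choice matters because standardizing removes differences in level between artists. The sketch below uses made-up numbers and assumes that "standardized" means z-scores computed within each artist, which this section does not confirm.

```r
toy = data.frame(
  Artist = c("A", "A", "A", "B", "B", "B"),
  pop1 = c(1, 2, 3, 10, 20, 30)
)

# z-score within each artist: (value - artist mean) / artist sd
toy$pop1Z = ave(toy$pop1, toy$Artist, FUN = function(x) (x - mean(x)) / sd(x))

# Raw means differ between artists; the standardized means are both zero.
rawMeans = tapply(toy$pop1, toy$Artist, mean)
zMeans = tapply(toy$pop1Z, toy$Artist, mean)
print(rawMeans)
print(zMeans)
```

On the raw scale artist B sits well above artist A, while after z-scoring the two become indistinguishable in level, so unstandardized values are the ones that allow a between-artist comparison.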

maroon5Reg = completeArchDf("Maroon 5", c("Adam Levine", "Jesse Carmichael", "Mickey Madden", "James Valentine", "Matt Flynn", "PJ Morton", "Sam Farrar", "Ryan Dusick"), c("Red Pill Blues + (Deluxe)", "v (Deluxe)", "Overexposed Track by Track", "Hands all over (Deluxe)", "it Won't be Soon Before Long.", "Songs About Jane"), c(), FALSE)
## Joining, by = "word"
maroon5Reg$Artist = "Maroon 5"

taylorSwiftReg = completeArchDf("Taylor Swift", c("Taylor Swift"), c("Reputation", "1989 (Deluxe Edition)", "Red (Deluxe Edition)", "Speak Now (Deluxe Edition)", "Fearless (Platinum Edition)", "Taylor Swift"), c(), FALSE )
## Joining, by = "word"
taylorSwiftReg$Artist = "Taylor Swift"

fooFightersReg = completeArchDf("Foo Fighters", c("Dave Grohl", "Nate Mendel", "Pat Smear", "Taylor Hawkins", "Chris Shiflett", "Rami Jaffee", "William Goldsmith", "Franz Stahl"), c("Wasting Light", "Echoes, Silence, Patience & Grace", "In your Honor", "One by One (Expanded Edition)", "There is Nothing Left to Lose", "The Colour and the Shape", "Concrete and Gold", "Foo Fighters", "Sonic Highways"), c(), FALSE)
## Joining, by = "word"
fooFightersReg$Artist = "Foo Fighters"

jColeReg = completeArchDf("j Cole", c("j Cole"), c("Revenge of the Dreamers Iii", "Kod", "2014 Forest Hills Drive", "Cole World: The Sideline Story", "4 your Eyez Only", "Born Sinner", "The Blow Up"), c(), FALSE)
## Joining, by = "word"
print(jColeReg)
##                       Name ReleaseDate                   Label
## 1         4 your Eyez Only  2016-12-09                    <NA>
## 2        Album of the Year  2018-08-07                    <NA>
## 3               Apparently  2014-12-09                    <NA>
## 4                      Atm  2018-04-20                    <NA>
## 5                 Brackets  2018-04-20                    <NA>
## 6         Can't Get Enough  2011-09-01                Columbia
## 7                   Change  2016-12-09                    <NA>
## 8            Crooked Smile  2013-06-04                Columbia
## 9                  Deja Vu  2016-12-09                    <NA>
## 10          Everybody Dies  2016-12-02                    <NA>
## 11          False Prophets  2016-12-01                    <NA>
## 12          Foldin Clothes  2016-12-09                    <NA>
## 13 For Whom the Bell Tolls  2016-12-09                    <NA>
## 14                 Friends  2018-04-20                    <NA>
## 15                Immortal  2016-12-09                    <NA>
## 16                   Intro  2018-04-20                    <NA>
## 17                   Intro  2018-04-20                    <NA>
## 18                   Intro  2018-04-20                    <NA>
## 19                   Intro  2018-04-20                    <NA>
## 20                   Intro  2018-04-20                    <NA>
## 21                   Intro  2018-04-20                    <NA>
## 22                   Intro  2018-04-20                    <NA>
## 23                   Intro  2018-04-20                    <NA>
## 24                   Intro  2018-04-20                    <NA>
## 25           Kevin's Heart  2018-04-20                    <NA>
## 26                     Kod  2018-04-20                    <NA>
## 27            Middle Child  2019-01-23 Dreamville / Roc Nation
## 28                  Motiv8  2018-04-20                    <NA>
## 29               Neighbors  2016-12-09                    <NA>
## 30          No Role Modelz  2014-12-09                Columbia
## 31        Nobody's Perfect  2012-02-07                    <NA>
## 32          Once an Addict  2018-04-20                    <NA>
## 33              Photograph  2018-04-20                    <NA>
## 34              Power Trip  2013-02-14                Columbia
## 35             The Cut Off  2018-04-20                    <NA>
## 36         Ville Mentality  2016-12-08                    <NA>
## 37              Wet Dreamz  2014-12-09                Columbia
## 38                 Who Dat  2010-05-31                    <NA>
## 39             Window Pain  2018-04-20                    <NA>
## 40                Work Out  2011-06-15   Roc Nation / Columbia
##                             Album        pop1       pop2     pop3     pop4
## 1                4 your Eyez Only  0.03448276 0.03448276 4.278054 4.278054
## 2                            <NA>  0.01149425 0.01149425 2.646175 2.646175
## 3         2014 Forest Hills Drive  1.93047180 0.22212162 3.763523 3.076983
## 4                             Kod  0.38752898 0.23544230 4.554929 3.715292
## 5                             Kod  0.03333333 0.03333333 4.264087 4.264087
## 6  Cole World: The Sideline Story  3.41457134 0.32037345 3.893859 3.609290
## 7                4 your Eyez Only  0.07171543 0.05966724 4.383276 3.639594
## 8                     Born Sinner  5.52643770 0.47061960 4.305416 3.980937
## 9                4 your Eyez Only  3.46526138 0.48440854 4.544358 3.571305
## 10                           <NA>  0.01754386 0.01754386 3.786460 3.786460
## 11                           <NA>  0.04778973 0.02927121 3.852273 2.972069
## 12               4 your Eyez Only  0.05374150 0.04353741 4.264087 2.697745
## 13               4 your Eyez Only  0.06498364 0.05423095 4.357990 3.224927
## 14                           <NA>  0.02173913 0.02173913 4.009150 4.009150
## 15               4 your Eyez Only  0.26925192 0.14506853 4.500920 3.220235
## 16 Cole World: The Sideline Story  0.01886792 0.01886792 3.873282 3.873282
## 17 Cole World: The Sideline Story  0.01886792 0.01886792 3.873282 3.873282
## 18 Cole World: The Sideline Story  0.01886792 0.01886792 3.873282 3.873282
## 19                            Kod  0.01886792 0.01886792 3.873282 3.873282
## 20                            Kod  0.01886792 0.01886792 3.873282 3.873282
## 21                            Kod  0.01886792 0.01886792 3.873282 3.873282
## 22        2014 Forest Hills Drive  0.01886792 0.01886792 3.873282 3.873282
## 23        2014 Forest Hills Drive  0.01886792 0.01886792 3.873282 3.873282
## 24        2014 Forest Hills Drive  0.01886792 0.01886792 3.873282 3.873282
## 25                           <NA>  0.33246960 0.18895264 4.533674 3.600024
## 26                            Kod  1.29504874 0.28727103 4.511958 3.699885
## 27    Revenge of the Dreamers Iii 20.88688824 2.16905996 4.575741 4.458565
## 28                            Kod  0.09483568 0.08075117 4.455509 3.930017
## 29               4 your Eyez Only  0.36849758 0.15365051 4.478473 3.652004
## 30        2014 Forest Hills Drive  7.11264959 0.48206565 4.175925 3.653336
## 31 Cole World: The Sideline Story  2.40644699 0.24984939 3.691376 2.901888
## 32                           <NA>  0.02127660 0.02127660 3.990834 3.990834
## 33                            Kod  0.09642857 0.08392857 4.467057 3.758165
## 34                    Born Sinner 16.53854529 1.00535499 4.407938 4.049711
## 35                           <NA>  0.03571429 0.03571429 4.291828 4.291828
## 36               4 your Eyez Only  0.06388889 0.05277778 4.345103 3.376024
## 37        2014 Forest Hills Drive  2.84362761 0.26636225 3.691376 3.073159
## 38                           <NA>  0.01075269 0.01075269 2.091864 2.091864
## 39                           <NA>  0.02439024 0.02439024 4.096010 4.096010
## 40 Cole World: The Sideline Story 18.44841715 1.13219629 4.478473 4.024805
##    nonBandMemberWriters lyricalComplexity musicalComplexity
## 1                     0          13.93350                 1
## 2                     3          13.97051                 1
## 3                     0          11.57687                 1
## 4                     2          13.13359                 1
## 5                     4          12.69375                 1
## 6                     1          12.81370                 1
## 7                     0          13.94342                 1
## 8                     1          13.28807                 1
## 9                     0          13.01123                 1
## 10                    0          14.27632                 1
## 11                    0          13.94164                 1
## 12                    0          11.90967                 1
## 13                    0          10.66431                 1
## 14                    0          14.11279                 1
## 15                    0          12.52378                 1
## 16                    0          17.98508                 1
## 17                    0          14.99689                 1
## 18                    0          16.08843                 1
## 19                    0          17.98508                 1
## 20                    0          14.99689                 1
## 21                    0          16.08843                 1
## 22                    0          17.98508                 1
## 23                    0          14.99689                 1
## 24                    0          16.08843                 1
## 25                    0          12.52765                 1
## 26                    0          13.36855                 1
## 27                    0          13.48261                 1
## 28                    0          13.18835                 1
## 29                    2          13.21251                 1
## 30                    0          12.69324                 1
## 31                    2          12.68357                 1
## 32                    0          13.90294                 1
## 33                    0          12.21598                 1
## 34                    0          12.52172                 1
## 35                    1          12.96982                 1
## 36                    0          11.50065                 1
## 37                    2          13.00303                 1
## 38                    4          13.60967                 1
## 39                    0          12.99910                 1
## 40                    6          11.72769                 1
##    totalComplexity
## 1         14.93350
## 2         14.97051
## 3         12.57687
## 4         14.13359
## 5         13.69375
## 6         13.81370
## 7         14.94342
## 8         14.28807
## 9         14.01123
## 10        15.27632
## 11        14.94164
## 12        12.90967
## 13        11.66431
## 14        15.11279
## 15        13.52378
## 16        18.98508
## 17        15.99689
## 18        17.08843
## 19        18.98508
## 20        15.99689
## 21        17.08843
## 22        18.98508
## 23        15.99689
## 24        17.08843
## 25        13.52765
## 26        14.36855
## 27        14.48261
## 28        14.18835
## 29        14.21251
## 30        13.69324
## 31        13.68357
## 32        14.90294
## 33        13.21598
## 34        13.52172
## 35        13.96982
## 36        12.50065
## 37        14.00303
## 38        14.60967
## 39        13.99910
## 40        12.72769
jColeReg$Artist = "J Cole"
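
The printed table contains nine rows named "Intro", covering three albums and three distinct lyricalComplexity values. That pattern is what a join on song name alone would produce when several albums share a track title. This is an inference from the output, not something confirmed by frostFunctions.R. A minimal sketch of the effect with made-up values:

```r
# Three different albums each contain a track called "Intro" (made-up values).
tracks = data.frame(Name = c("Intro", "Intro", "Intro"),
                    Album = c("X", "Y", "Z"))
lyrics = data.frame(Name = c("Intro", "Intro", "Intro"),
                    lyricalComplexity = c(18.0, 15.0, 16.1))

# Joining on Name alone pairs every track row with every lyrics row.
byName = merge(tracks, lyrics, by = "Name")
print(nrow(byName))

# With Album available in both tables and included in the key, each track matches once.
lyrics$Album = c("X", "Y", "Z")
byNameAlbum = merge(tracks, lyrics, by = c("Name", "Album"))
print(nrow(byNameAlbum))
```

If this is the cause, the duplicated rows would give the "Intro" tracks extra weight in the trend lines plotted later.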

justinTimberlakeReg = completeArchDf("Justin Timberlake", c("Justin Timberlake"), c("Justified", "Man of the Woods", "The 20/20 Experience - 2 of 2 (Deluxe)", "The 20/20 Experience (Deluxe Version)", "Futuresex/Lovesounds Deluxe Edition"), c(), FALSE)
## Joining, by = "word"
justinTimberlakeReg$Artist = "Justin Timberlake"


# Plot popularity, lyrical complexity, and outside-writer count against release date
# for a list of per-artist data frames, with one linear trend line per artist.
artistCompare = function(artistDfs){
  fullDf = bind_rows(artistDfs)
  artists = unique(fullDf$Artist)
  # Build "A, B and C" for the plot title
  artist1 = paste(artists[-length(artists)], collapse = ", ")
  artist2 = tail(artists, n= 1)[[1]]
  popPlot = ggplot(fullDf, aes(x = ReleaseDate, y = pop1, color = Artist, shape = Artist)) + geom_smooth(method = "lm",se = FALSE)+ geom_point(alpha= 0.5) + labs(x = "Release Date", y = "Popularity Metric (AU)")
  compPlot= ggplot(fullDf, aes(x = ReleaseDate, y = lyricalComplexity, color = Artist, shape = Artist)) + geom_smooth(method = "lm", se = FALSE)+ geom_point(alpha= 0.5) + labs(x= "Release Date",y = "Lyrical Complexity Metric (AU)")
  infPlot = ggplot(fullDf, aes(x = ReleaseDate, y = nonBandMemberWriters, color = Artist, shape = Artist)) + geom_smooth(method = "lm", se = FALSE)+ geom_point(alpha= 0.5) + labs(x= "Release Date",y = "Number of Outside Writers")
  # paste() already separates its arguments with a space, so the pieces carry no padding
  grid.arrange(popPlot, compPlot, infPlot, ncol = 2, nrow = 2, top = paste("Popularity, Lyrical Complexity, and Outside Influence of", artist1, "and", artist2, "Songs Over Time"))
}

artistCompare(list(maroon5Reg, fooFightersReg, justinTimberlakeReg))
## Warning: Removed 1 rows containing non-finite values (stat_smooth).
## Warning: Removed 1 rows containing missing values (geom_point).

artistCompare(list(fooFightersReg, jColeReg))

artistCompare(list(maroon5Reg, taylorSwiftReg))
## Warning: Removed 1 rows containing non-finite values (stat_smooth).

## Warning: Removed 1 rows containing missing values (geom_point).
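
The trend lines drawn by geom_smooth(method = "lm") are separate linear fits for each artist against release date. The sketch below computes such slopes numerically so that they can be compared without reading them off a plot. It uses made-up data, and the conversion to a per-year slope assumes that dates are treated as numeric day counts, which is how as.numeric() handles Date values in R.

```r
toy = data.frame(
  Artist = rep(c("A", "B"), each = 4),
  ReleaseDate = as.Date(c("2010-01-01", "2012-01-01", "2014-01-01", "2016-01-01",
                          "2011-01-01", "2013-01-01", "2015-01-01", "2017-01-01")),
  nonBandMemberWriters = c(0, 1, 2, 3, 4, 3, 3, 1)
)

# Fit writers ~ release date separately for each artist.
slopes = sapply(split(toy, toy$Artist), function(df){
  fit = lm(nonBandMemberWriters ~ as.numeric(ReleaseDate), data = df)
  unname(coef(fit)[2]) * 365.25   # convert the per-day slope to per-year
})
print(slopes)
```

A positive slope corresponds to a rising number of outside writers over an artist's career, and a negative slope to a falling one.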